Skip to content

fix: support SSH agent and passphrase-protected keys#10

Merged
mcheemaa merged 1 commit intoghostwright:mainfrom
CaddyGlow:fix/ssh-agent-passphrase
Apr 5, 2026
Merged

fix: support SSH agent and passphrase-protected keys#10
mcheemaa merged 1 commit intoghostwright:mainfrom
CaddyGlow:fix/ssh-agent-passphrase

Conversation

@CaddyGlow
Copy link
Copy Markdown
Contributor

Summary

  • SSH agent support: SSHConnect now tries SSH_AUTH_SOCK first, so passphrase-protected keys loaded in the agent work out of the box
  • Graceful fallback: Unprotected key files (id_ed25519, id_rsa) are still used as a fallback
  • Debug logging: WaitForSSH now logs each connection attempt and failure reason for easier troubleshooting

Problem

SSHConnect called ssh.ParsePrivateKey directly on key files, which fails immediately with "this private key is passphrase protected" for any user whose default SSH key has a passphrase. This caused WaitForSSH to retry indefinitely with the same error.

Copilot AI review requested due to automatic review settings March 31, 2026 13:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves SSH connectivity to Hetzner-provisioned hosts by supporting passphrase-protected SSH keys via ssh-agent, preserving unprotected key-file fallback, and adding per-attempt connection logging in the SSH wait loop.

Changes:

  • Add SSH agent (SSH_AUTH_SOCK) support as the first authentication option in SSHConnect.
  • Keep fallback to parsing unprotected ~/.ssh/id_ed25519 / ~/.ssh/id_rsa key files.
  • Add attempt-by-attempt logging (including failure reasons) to WaitForSSH.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +304 to +307
signer, err := ssh.ParsePrivateKey(keyBytes)
if err != nil {
continue // passphrase-protected, skip — agent handles these
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback key parsing currently continues on any ssh.ParsePrivateKey error, not just the expected “passphrase missing” case. This can hide real problems (corrupted key file, unsupported format, etc.) and lead to a confusing “no SSH auth available” error. Consider only skipping when the error is a passphrase-missing/encrypted-key error (e.g., via errors.As), and otherwise propagate or include the parse error in the returned error when no auth methods are usable.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Comment on lines +331 to +348
attempt := 0
log.Printf("[ssh] waiting for SSH on %s", ip)
for {
select {
case <-ctx.Done():
log.Printf("[ssh] context cancelled after %d attempts for %s: %v", attempt, ip, ctx.Err())
return nil, ctx.Err()
default:
}

attempt++
log.Printf("[ssh] attempt %d connecting to %s:22", attempt, ip)
client, err := SSHConnect(ip)
if err == nil {
log.Printf("[ssh] connected to %s after %d attempts", ip, attempt)
return client, nil
}
log.Printf("[ssh] attempt %d failed: %v", attempt, err)
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WaitForSSH now logs every attempt using the standard library log package. This repository doesn’t appear to use log elsewhere, and WaitForSSH is called from Bubble Tea TUI flows (e.g., internal/tui/image_build.go / deploy_progress.go), where writing directly to stderr/stdout can corrupt the terminal UI. Consider moving these logs to the TUI layer, making logging optional via an injected logger, or gating it behind an env/config debug flag.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Comment on lines 297 to 314
home, err := os.UserHomeDir()
if err == nil {
for _, name := range []string{"id_ed25519", "id_rsa"} {
keyBytes, err := os.ReadFile(filepath.Join(home, ".ssh", name))
if err != nil {
continue
}
signer, err := ssh.ParsePrivateKey(keyBytes)
if err != nil {
continue // passphrase-protected, skip — agent handles these
}
authMethods = append(authMethods, ssh.PublicKeys(signer))
}
}

signer, err := ssh.ParsePrivateKey(keyBytes)
if err != nil {
return nil, fmt.Errorf("error parsing SSH key: %w", err)
if len(authMethods) == 0 {
return nil, fmt.Errorf("no SSH auth available: set SSH_AUTH_SOCK or provide an unprotected key at ~/.ssh/id_ed25519 or ~/.ssh/id_rsa")
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.UserHomeDir() errors are currently ignored (the code just skips key-file fallback). If the agent also isn’t available, the function returns the generic “no SSH auth available … ~/.ssh/…” message, which can be misleading when the real issue is that the home directory couldn’t be determined. Consider returning/wrapping the UserHomeDir error when it prevents all key-file discovery, or at least include it in the final error when authMethods is empty.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

SSHConnect failed with "this private key is passphrase protected" for
any user whose default key has a passphrase. Now tries SSH agent first
(via SSH_AUTH_SOCK), then falls back to raw key files for unprotected
keys.

- Only skip passphrase-protected keys (PassphraseMissingError), surface
  other parse errors (corrupted key, unsupported format)
- Surface agent dial errors and UserHomeDir errors in diagnostics when
  no auth methods are available
- Close agent socket on dial failure to prevent FD leaks in retry loop
@CaddyGlow CaddyGlow force-pushed the fix/ssh-agent-passphrase branch from e396525 to ec9a176 Compare March 31, 2026 15:23
@CaddyGlow
Copy link
Copy Markdown
Contributor Author

Fixed all suggestions from Copilot review:

  • Agent socket leak: agent connection is now closed on dial failure to prevent FD leaks in the retry loop
  • Swallowing parse errors: only PassphraseMissingError is skipped; other parse errors (corrupted key, unsupported format) are surfaced in diagnostics
  • Silent agent dial failure: agent dial errors and UserHomeDir errors are now included in the final error message when no auth methods are available
  • Log output corrupting TUI: removed all log.Printf calls — the improved error messages in SSHConnect already surface the root cause

@mcheemaa mcheemaa self-requested a review April 5, 2026 03:01
Copy link
Copy Markdown
Member

@mcheemaa mcheemaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @CaddyGlow! Great contribution. The SSH agent support is a real fix for anyone with passphrase-protected keys, and the diagnostic error collection is a nice touch.

I noticed a small agent connection leak on the success path (agentConn isn't closed after ssh.Dial succeeds), but I'll fix that in a follow-up commit rather than hold this up any longer.

This also partially addresses #11 since ssh-agent handles all loaded keys regardless of filename.

Merging!

@mcheemaa mcheemaa merged commit c4d5392 into ghostwright:main Apr 5, 2026
2 checks passed
mcheemaa added a commit that referenced this pull request Apr 5, 2026
Go's x/crypto/ssh library fails the entire publickey auth chain when
PublicKeysCallback returns zero signers from an empty agent. This
breaks SSH for macOS users where SSH_AUTH_SOCK is always set but no
keys are explicitly loaded in the agent.

Now checks agent.Signers() before adding the agent as an auth method.
Also closes the agent connection immediately after use (fixes FD leak
from #10) and removes em dash from comment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants